Abstract: We present a compositional SMT-based algorithm
for safety of procedural C programs that takes the heap into
consideration as well. Existing SMT-based approaches are either
largely restricted to handling linear arithmetic operations and
properties, or are non-compositional. We use Constrained Horn
Clauses (CHCs) to represent the verification conditions where the
memory operations are modeled using the extensional theory of
arrays (ARR). First, we describe an exponential time quantifier
elimination (QE) algorithm for ARR which can introduce new
quantifiers of the index and value sorts. Second, we adapt the
QE algorithm to efficiently obtain under-approximations using
models, resulting in a polynomial time Model Based Projection
(MBP) algorithm. Third, we integrate the MBP algorithm into the
framework of compositional reasoning of procedural programs
using may and must summaries recently proposed by us. Our
solutions to the CHCs are currently restricted to quantifier-free
formulas. Finally, we describe our practical experience over
SV-COMP'15 benchmarks using an implementation in the tool
Spacer.

I. Introduction 

Under-approximating a projection (i.e., existential
quantification), for example in computing an image, is a key
aspect of many techniques of symbolic model checking. A
typical (though not ubiquitous) approach to this is what we
call Model-based Projection (MBP) [17]: we generalize a
particular point in the space of the image (obtained using
a model) to a subset of the image that contains it. In some
cases, the purpose is to compute the exact image by a series
of under-approximations [12]. In other cases, such as IC3 [6],
the purpose of MBP is to produce a relevant proof sub-goal.
When the number of possible generalizations is finite, we say
that we have a finite MBP which allows us to compute the
exact image by iterative sampling, or to guarantee that the
branching in our proof search is finite.

The feasibility of a finite MBP depends on the underlying 
logical theory. Finite MBPs exist for propositional logic [12], 
[16] and Linear Integer Arithmetic (LIA) with a divisibility 
predicate [17], and have been applied in both hardware and 
software model checking. LIA is often adequate for software 
verification, provided that heap and array accesses can be 
eliminated. This can be done by abstraction, or by inlining 
all procedures and performing compiler optimizations to lower 
memory into registers (e.g., [2], [15]). However, the inlining
approach has many drawbacks. It can expand the program size 
exponentially, it cannot handle recursion, and it is not always 
feasible to eliminate heap and array accesses. 

We address this issue here by considering the problem of
MBP for the extensional theory of arrays (ARR). We find that
a finite MBP exists that can be computed in polynomial time 
when only array-valued variables are projected. Projecting 
variables of index and value sorts is not always possible, since 
the quantifier-free fragments of the theory combinations are 
not guaranteed to be closed under projection. We therefore take 
a pragmatic approach to MBP that may not always converge 
to the exact projection. This allows us to handle, for example, 
the combination of ARR and LIA. 

We test the effectiveness of this approach using the model
checking framework of SPACER [17]. This SMT-based framework
makes use of MBP to produce proof sub-goals for
Hoare-style procedure-modular proofs of recursive programs.
The ability to reason with ARR makes it possible to handle
heap-allocating programs without inlining procedures, as the
heap can be faithfully modeled using ARR [14]. This leads
to significant improvements in scalability, when compared
to the use of LIA alone with inlining, as measured using
benchmark programs from the 2015 Software Verification
Competition (SVCOMP 2015) [4]. Not inlining the programs
also has the advantage that we generate procedure-modular
proofs (containing procedure summaries) that might be
reusable in various ways (e.g., [11]).

In summary, we (a) describe an exponential rewriting
procedure for projecting array variables (Sec. III-A), (b) adapt
this procedure to obtain a polynomial-time (per model) finite
MBP for projecting array variables (Sec. III-B), (c) integrate
this with existing MBP procedures for Linear Arithmetic
(Sec. III-C) in the SPACER framework obtaining a new
compositional proof search algorithm (Sec. IV), and (d) evaluate
the algorithm experimentally using SVCOMP benchmarks
(Sec. V).

II. Preliminaries

We consider a first-order language with equality whose
signature $\mathcal{S}$ contains basic sorts (e.g., bool of Booleans, int
of integers, etc.) and array sorts. An array sort $arr(I, V)$ is
parameterized by a sort of indices $I$ and a sort of values $V$.
We assume that $I$ is always a basic sort. For every array sort
$arr(I, V)$, the language has the usual function symbols
$rd : arr(I, V) \times I \to V$ and $wr : arr(I, V) \times I \times V \to arr(I, V)$
for reading from and writing to the array. Intuitively, $rd(a, i)$
denotes the value stored in the array $a$ at the index $i$ and
$wr(a, i, v)$ denotes the array obtained from $a$ by replacing the
value at the index $i$ by $v$. We use the following axioms for
the extensional theory of arrays (ARR):

Read-after-write

$$\forall a : arr(I, V) \cdot \forall i, j : I \cdot \forall v : V \cdot (i = j \Rightarrow rd(wr(a, i, v), j) = v) \land (i \neq j \Rightarrow rd(wr(a, i, v), j) = rd(a, j))$$

Extensionality

$$\forall a, b : arr(I, V) \cdot (\forall i : I \cdot rd(a, i) = rd(b, i)) \Rightarrow a = b$$

Intuitively, the first schema says that after modifying an
array $a$ at index $i$, a read results in the new value at index
$i$ and $rd(a, j)$ at every other index $j$. The second schema
says that if two arrays agree on the values at every index
location, the arrays are equal. We use an over-bar to denote
a vector. We write $\bar{x} : S$ to denote that every term in vector
$\bar{x}$ has sort $S$, $\bar{x}(k)$ to denote the $k$th component of $\bar{x}$, and
$y \in \bar{x}$ to denote that $y$ is equal to some component of $\bar{x}$, i.e.,
$\bigvee_{k} y = \bar{x}(k)$. Let $\bar{i} : I$ and $\bar{v} : V$ be vectors of index and
value terms of the same length $m$. We write $wr(a, \bar{i}, \bar{v})$ to
denote $wr(wr(\ldots wr(a, \bar{i}(0), \bar{v}(0)) \ldots), \bar{i}(m), \bar{v}(m))$. Unless
specified otherwise, $\mathcal{S}$ contains no other symbols.
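The two axiom schemas can be exercised on a small executable model. The following is only a sketch: it assumes integer indices and values, and it represents an array as a default value plus finitely many overrides (the names `mk_array`, `rd`, and `wr` are illustrative, and this encoding covers only arrays that differ from a constant array at finitely many indices).

```python
from typing import Dict, Optional, Tuple

# An array is (default value, sorted tuple of (index, value) overrides).
Array = Tuple[int, Tuple[Tuple[int, int], ...]]

def mk_array(default: int, entries: Optional[Dict[int, int]] = None) -> Array:
    """Build a normalized array; entries equal to the default are dropped
    so that extensionally equal arrays get identical encodings."""
    entries = entries or {}
    overrides = tuple(sorted((i, v) for i, v in entries.items() if v != default))
    return (default, overrides)

def rd(a: Array, i: int) -> int:
    """rd(a, i): the value stored in a at index i."""
    default, overrides = a
    return dict(overrides).get(i, default)

def wr(a: Array, i: int, v: int) -> Array:
    """wr(a, i, v): the array obtained from a by storing v at index i."""
    default, overrides = a
    updated = dict(overrides)
    updated[i] = v
    return mk_array(default, updated)

a = mk_array(0, {1: 7})
b = wr(a, 2, 5)
# Read-after-write: the written index returns the new value,
# every other index returns what a held.
assert rd(b, 2) == 5
assert rd(b, 1) == rd(a, 1)
# Extensionality: writing the old value back yields an array that agrees
# with a at every index, hence is equal to a.
assert wr(b, 2, rd(a, 2)) == a
```

Normalizing the overrides is what makes Python's `==` coincide with extensional equality in this encoding.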

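For arrays $a$ and $b$ of sort $arr(I, V)$, and a (possibly
empty) vector of index terms $\bar{i}$, we write $a =_{\bar{i}} b$ to denote
$\forall j : I \cdot (j \notin \bar{i} \Rightarrow rd(a, j) = rd(b, j))$ and call such formulas
partial equalities [20]. Using extensionality, one can easily
show the following:

$$a =_{\emptyset} b \equiv a = b \quad (1)$$

$$wr(a, j, v) =_{\bar{i}} b \equiv (j \in \bar{i} \land a =_{\bar{i}} b) \lor (j \notin \bar{i} \land a =_{\bar{i}, j} b \land rd(b, j) = v) \quad (2)$$

$$a =_{\bar{i}} b \equiv \exists \bar{v} : V \cdot a = wr(b, \bar{i}, \bar{v}) \quad (3)$$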

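We write $\varphi(\bar{x})$ for a formula $\varphi$ with free variables $\bar{x}$, and we
treat $\varphi$ as a predicate over $\bar{x}$. We also write $\varphi[t]$ to indicate
that a term or formula $t$ occurs in $\varphi$ at some syntactic position.

Given formulas $\varphi_A(\bar{x}, \bar{z})$ and $\varphi_B(\bar{y}, \bar{z})$ with $\bar{x} \cap \bar{y} = \emptyset$ and
$\varphi_A \Rightarrow \varphi_B$, a Craig Interpolant [7], denoted $ITP(\varphi_A, \varphi_B)$,
is a formula $\varphi_I(\bar{z})$ such that $\varphi_A \Rightarrow \varphi_I$ and $\varphi_I \Rightarrow \varphi_B$.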

III. QE AND MBP FOR THE THEORY ARR 

By projection of a variable we mean elimination of an
existential quantifier. Consider a formula $\varphi$ of the form
$\exists \bar{x} \cdot \varphi_{qf}(\bar{x}, \bar{y})$ where $\varphi_{qf}$ is quantifier-free. The problem
of quantifier elimination (QE) in $\varphi$ is to find a logically
equivalent quantifier-free formula $\psi(\bar{y})$. In this case, we say
that $\psi$ is the result of projecting $\bar{x}$ in $\varphi_{qf}$.

A model-based projection (MBP) for $\varphi$ is an operator Proj
that takes a model $M$ of $\varphi_{qf}$ and returns a quantifier-free
formula $\psi_M(\bar{y})$ such that $M \models \psi_M$ and $\psi_M$ entails $\varphi$. The
operator Proj is a finite MBP if its image is finite up to logical
equivalence (that is, over all models we obtain only finitely
many semantically distinct formulas).^1 In this case, we obtain
the exact projection as the disjunction of the image of Proj.
We will refer to Proj($M$) as a generalization of $M$.

In some cases, there is a trivial approach to MBP that we
will call the substitution approach. We simply substitute for
each variable $x$ in $\varphi$ a constant that is equal to $x$ in the given
model $M$ (for example, a numeric literal). This approach was
taken for propositional logic by Ganai et al. [12]. For theories
that admit models of unbounded size (e.g., LIA), however,
this does not yield a finite MBP, as the number of distinct
generalizations we obtain can be infinite.
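To make the last point concrete, here is a small sketch under assumed names. It projects $x$ from the illustrative formula $x < y$ (not taken from the paper) by substitution and counts how many semantically distinct generalizations arise over a bounded window of integers. Two generalizations are treated as distinct when they differ on some sampled value of $y$.

```python
def phi(x: int, y: int) -> bool:
    """Illustrative quantifier-free formula: x < y."""
    return x < y

def substitute(model_x: int):
    """Substitution approach: replace x by its value in the model,
    leaving a formula over y only."""
    return lambda y: phi(model_x, y)

def distinct_generalizations(model_values, ys):
    """Collect the truth tables (over the sampled ys) of the generalizations
    obtained from each model value of x."""
    return {tuple(substitute(c)(y) for y in ys) for c in model_values}

ys = range(-5, 6)
count = len(distinct_generalizations(range(-5, 5), ys))
# Every model value of x yields a different formula "c < y".
assert count == 10
```

Widening the two ranges makes the count grow without bound, whereas the exact projection of $\exists x \cdot x < y$ over the integers is simply true.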

Instead, we can take the approach used for Linear Real
Arithmetic and LIA in our earlier work [17]. Suppose that for
the given theory we have a QE procedure that produces a
formula with an exponential (or higher) number of disjunctions.
We can adapt this procedure to an MBP by always choosing
just one disjunct that is true in the given model $M$. The result
may be a procedure that is polynomial for any given model,
though the number of distinct generalizations is exponential.
We will show how to apply this idea for the projection of
array-valued variables in the theory of arrays ARR. When combining
this theory with LIA, we will find that some variables of index
and value sorts must be eliminated by the substitution method,
which gives us a useful MBP but not necessarily a finite MBP.

A. Quantifier elimination for ARR 

Consider an existentially quantified formula
$\exists a : arr(I, V) \cdot \varphi$ where $\varphi$ is quantifier-free. While we cannot
always obtain an equivalent quantifier-free formula, our
objective here is to obtain an equivalent existentially quantified
formula where every quantifier (if any) is of the sort $I$ or $V$.
As a simplification, we restrict the interpretations of $I$, the
index sort, to infinite domains. Handling finite index domains
requires a slight adaptation of the algorithms as described in
Appendix A.

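ArrayQE($\exists a \cdot \varphi$)

1  $\varphi_1 \leftarrow$ (ElimWr*)($\exists a \cdot \varphi$)
2  $\varphi_2 \leftarrow$ (CaseSplitEq*; FactorRd*)($\varphi_1$)
3  $(\bigvee_{k=1}^{n} \varphi_k) \leftarrow$ LiftEqDiseqRd($\varphi_2$)
4  for $k \in [1, n]$ do
5      $\psi_k \leftarrow$ (ElimEq; ElimDiseq; Ackermann)($\varphi_k$)
6  return $\bigvee_{k=1}^{n} \psi_k$

Algorithm 1: QE for $\exists a \cdot \varphi$, where $a$ is an array variable.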

Our algorithm is inspired by the decision procedure for the
quantifier-free fragment of ARR by Stump et al. [20]. At a
high level, the QE algorithm proceeds in 3 steps: (i) eliminate
write terms using the read-after-write axiom schema and
partial equalities over arrays, (ii) eliminate (partial) equalities and
disequalities over arrays, and (iii) eliminate read terms over
arrays. Alg. 1 shows the pseudo-code for our QE algorithm
ArrayQE using the rewrite rules in Fig. 1, 2, and 3. Each rule

^1 MBP as defined in [17] corresponds to finite MBP here.


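ElimWrRd: $\dfrac{\varphi[rd(wr(t, i, v), j)]}{(i = j \land \varphi[v]) \lor (i \neq j \land \varphi[rd(t, j)])}$

ElimWrEq: $\dfrac{\varphi[wr(t_1, j, v) =_{\bar{i}} t_2]}{(j \in \bar{i} \land \varphi[t_1 =_{\bar{i}} t_2]) \lor (j \notin \bar{i} \land \varphi[t_1 =_{\bar{i}, j} t_2 \land v = rd(t_2, j)])}$

PartialEq: $\dfrac{\varphi[t_1 = t_2]}{\varphi[t_1 =_{\emptyset} t_2]}$, where the $t_i$'s have array sort

TrivEq: $\dfrac{\varphi[t =_{\bar{i}} t]}{\varphi[\top]}$

Symm: $\dfrac{\varphi[t_1 =_{\bar{i}} t_2]}{\varphi[t_2 =_{\bar{i}} t_1]}$, where $t_2$ is a write term but $t_1$ is not

ElimWr = (ElimWrRd | ElimWrEq | PartialEq | TrivEq | Symm)

Fig. 1: Rewriting rules to eliminate write terms. ElimWr denotes one of the rules chosen non-deterministically.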


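CaseSplitEq: $\dfrac{\exists a \cdot \varphi[a =_{\bar{i}} t]}{\exists a \cdot ((a =_{\bar{i}} t \land \varphi[\top]) \lor (\neg(a =_{\bar{i}} t) \land \varphi[\bot]))}$

FactorRd: $\dfrac{\exists a \cdot \varphi[rd(a, t)]}{\exists a, s \cdot (\varphi[s] \land s = rd(a, t))}$, where $s$ is fresh and $t$ does not contain array terms

Fig. 2: Rewriting rules to factor out equalities and read terms on the quantified array variable.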


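ElimEq: $\dfrac{\exists a \cdot (a =_{\bar{i}} t \land \varphi)}{\exists \bar{v} \cdot \varphi[wr(t, \bar{i}, \bar{v})/a]}$,
where $a$ does not appear in $t$ and $\bar{v}$ denotes fresh variables

ElimDiseq: $\dfrac{\exists a \cdot \left(\varphi \land \bigwedge_{k=1}^{m} \neg(a =_{\bar{i}_k} t_k)\right)}{\exists a \cdot \varphi}$,
where $m \in \mathbb{N}$, $a$ does not appear in any $t_k$, and
$a$ appears in $\varphi$ only in read terms over $a$

Ackermann: $\dfrac{\exists a \cdot \left(\varphi \land \bigwedge_{k=1}^{m} s_k = rd(a, t_k)\right)}{\varphi \land \bigwedge_{1 \le k < l \le m} (t_k = t_l \Rightarrow s_k = s_l)}$,
where $m \in \mathbb{N}$ and $a$ does not appear in $\varphi$, the $s_k$'s, or the $t_k$'s

Fig. 3: Rewriting rules for QE of arrays.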


rewrites the formula above the line to the logically equivalent 
formula below the line. We use regular expression notation 
to express sequences of rewrites. In particular, Kleene star 
applied to a rule denotes the rule’s application to a fixed point. 

Line 1 of ArrayQE eliminates write terms using the
rewrite rules in Fig. 1. Here ElimWr denotes a rule in Fig. 1
chosen non-deterministically. ElimWrRd rewrites terms using
the read-after-write axiom and ElimWrEq rewrites partial
equalities using Eq. (2). PartialEq converts equalities into
partial equalities using Eq. (1). TrivEq eliminates trivial
partial equalities with identical arguments and Symm ensures
that write terms on the r.h.s. of equalities are also eliminated.


Line 2 of ArrayQE rewrites the formula by case-splitting
on partial equalities on the array quantifier $a$ (via
CaseSplitEq) followed by factoring out read terms over $a$ by
introducing new quantifiers of sort $V$ (via FactorRd). Note
that, as presented, these two rules are not terminating as 
the partial equalities and read terms are preserved in the 
conclusion of the rules. However, one can easily ensure that a 
given partial equality or read term is considered exactly once 
by first computing the set of all partial equalities and read 
terms in the formula and processing them in a sequential order. 
The details are straightforward and are left to the reader. 

LiftEqDiseqRd on line 3 of ArrayQE performs
Boolean rewriting and returns an equivalent disjunction such
that in every disjunct, the partial equalities, array disequalities,
and equalities over read terms appear at the end as conjuncts,
in that order. For each disjunct, line 5 applies the rules in
Fig. 3 to eliminate the array quantifier $a$. ElimEq obtains
a substitution term for $a$ using the equivalence in Eq. (3).
ElimDiseq is applicable when the disjunct contains no partial
equalities. Given that the domain of interpretation of $I$ is
infinite, one can always satisfy the disequalities, and hence
they can simply be dropped. Ackermann performs the
Ackermann reduction [1] to eliminate the read terms.

Note that while the rewrite rules are applicable to all array 
terms and equalities in the original formula, in practice, we 
only need to apply them to eliminate the relevant terms 
containing the array quantifier a. See Fig. 4 for an illustration 
of ArrayQE on an example. 
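The Ackermann step can be sketched in a few lines. This is an illustrative encoding only: read equalities $s_k = rd(a, t_k)$ are given as pairs of variable names, constraints are returned as nested tuples, and a model is a dictionary from names to integers. The helper `witness_array` shows why the array quantifier can be dropped: when the pairwise constraints hold, an array satisfying all read equalities can be built directly.

```python
from itertools import combinations

def ackermann(reads):
    """reads: list of (s_k, t_k) pairs standing for s_k = rd(a, t_k).
    Returns one constraint t_k = t_l => s_k = s_l for every pair k < l."""
    return [("implies", ("eq", tk, tl), ("eq", sk, sl))
            for (sk, tk), (sl, tl) in combinations(reads, 2)]

def holds(constraint, model):
    """Evaluate a single implication produced by ackermann() in a model."""
    _, (_, tk, tl), (_, sk, sl) = constraint
    return model[tk] != model[tl] or model[sk] == model[sl]

def witness_array(reads, model):
    """If all constraints hold, return an array (index -> value dict) with
    rd(a, t_k) = s_k for every k; otherwise return None."""
    if not all(holds(c, model) for c in ackermann(reads)):
        return None
    return {model[t]: model[s] for s, t in reads}

reads = [("s3", "i3"), ("s4", "i4")]
constraints = ackermann(reads)
```

For the example of Fig. 4, `constraints` contains the single implication relating `i3`, `i4`, `s3`, and `s4`. The number of constraints is quadratic in the number of read terms.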

Correctness and Complexity. We can show the following
properties of ArrayQE (proof sketches in Appendix B).

Theorem 1: ArrayQE($\exists a : arr(I, V) \cdot \varphi$) returns
$\exists \bar{v} : V \cdot \rho$, where $\rho$ is quantifier-free and $\exists \bar{v} \cdot \rho \equiv \exists a \cdot \varphi$.

Theorem 2: ArrayQE($\exists a \cdot \varphi$) terminates in time
exponential in the size of $\varphi$.












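$\exists a \cdot (b = wr(a, i_1, v_1) \lor (rd(wr(a, i_2, v_2), i_3) > 5 \land rd(a, i_4) > 0))$

{ElimWrRd}
$\equiv \exists a \cdot ((i_2 = i_3 \land (b = wr(a, i_1, v_1) \lor (v_2 > 5 \land rd(a, i_4) > 0))) \lor$
$\quad (i_2 \neq i_3 \land (b = wr(a, i_1, v_1) \lor (rd(a, i_3) > 5 \land rd(a, i_4) > 0))))$

{PartialEq; ElimWrEq}
$\equiv \exists a \cdot ((i_2 = i_3 \land ((a =_{i_1} b \land rd(b, i_1) = v_1) \lor (v_2 > 5 \land rd(a, i_4) > 0))) \lor$
$\quad (i_2 \neq i_3 \land ((a =_{i_1} b \land rd(b, i_1) = v_1) \lor (rd(a, i_3) > 5 \land rd(a, i_4) > 0))))$

{CaseSplitEq}
$\equiv \exists a \cdot ((a =_{i_1} b \land ((i_2 = i_3 \land (rd(b, i_1) = v_1 \lor (v_2 > 5 \land rd(a, i_4) > 0))) \lor$
$\quad\quad (i_2 \neq i_3 \land (rd(b, i_1) = v_1 \lor (rd(a, i_3) > 5 \land rd(a, i_4) > 0))))) \lor$
$\quad (\neg(a =_{i_1} b) \land ((i_2 = i_3 \land (v_2 > 5 \land rd(a, i_4) > 0)) \lor$
$\quad\quad (i_2 \neq i_3 \land (rd(a, i_3) > 5 \land rd(a, i_4) > 0)))))$

{FactorRd}
$\equiv \exists a, s_3, s_4 \cdot (((a =_{i_1} b \land \varphi_1) \lor (\neg(a =_{i_1} b) \land \varphi_2)) \land s_3 = rd(a, i_3) \land s_4 = rd(a, i_4))$
where
$\varphi_1 = (i_2 = i_3 \land (rd(b, i_1) = v_1 \lor (v_2 > 5 \land s_4 > 0))) \lor (i_2 \neq i_3 \land (rd(b, i_1) = v_1 \lor (s_3 > 5 \land s_4 > 0)))$
$\varphi_2 = (i_2 = i_3 \land (v_2 > 5 \land s_4 > 0)) \lor (i_2 \neq i_3 \land (s_3 > 5 \land s_4 > 0))$

{LiftEqDiseqRd}
$\equiv \exists a, s_3, s_4 \cdot ((\varphi_1 \land a =_{i_1} b \land s_3 = rd(a, i_3) \land s_4 = rd(a, i_4)) \lor$
$\quad (\varphi_2 \land \neg(a =_{i_1} b) \land s_3 = rd(a, i_3) \land s_4 = rd(a, i_4)))$

{ElimEq, ElimDiseq}
$\equiv \exists v, s_3, s_4 \cdot (\varphi_1 \land s_3 = rd(a, i_3) \land s_4 = rd(a, i_4))[wr(b, i_1, v)/a] \lor$
$\quad \exists a, s_3, s_4 \cdot (\varphi_2 \land s_3 = rd(a, i_3) \land s_4 = rd(a, i_4))$

{Ackermann}
$\equiv \exists v, s_3, s_4 \cdot (\varphi_1 \land s_3 = rd(a, i_3) \land s_4 = rd(a, i_4))[wr(b, i_1, v)/a] \lor$
$\quad \exists s_3, s_4 \cdot (\varphi_2 \land (i_3 = i_4 \Rightarrow s_3 = s_4))$

Fig. 4: Illustrating ArrayQE on an example.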


B. Model Based Projection 

In this section, we will assume that for a satisfiable formula
we can obtain a finite representation of a model of the formula
and that we can effectively evaluate the truth of any formula in
this model. This is possible for ARR and its combinations with
LIA and propositional logic. The ability to evaluate allows
us to strengthen a formula in a way that preserves a given
model. Suppose we have a formula $\varphi[\psi_1 \lor \psi_2]$ with model
$M$, where the sub-formula $\psi_1 \lor \psi_2$ occurs positively (under
an even number of negations) in $\varphi$. If we also have $M \models \psi_1$,
then $M \models \varphi[\psi_1]$ and clearly, $\varphi[\psi_1]$ entails $\varphi$. This gives us a
way to eliminate a disjunction while preserving a given model
and maintaining an under-approximation. If neither $\psi_1$ nor $\psi_2$
is true in $M$, we can similarly replace $\varphi$ with $\varphi[\bot]$. These
transformations are expressed as MBP rewrite rules in Fig. 5.
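The three transformations can be sketched over a tiny formula representation. Everything below is an illustrative assumption rather than the paper's implementation: formulas in negation-normal form are nested tuples, atoms carry a Python predicate over the model, and negation is folded into the atoms.

```python
def ev(f, m):
    """Evaluate an NNF formula f in model m (a dict of variable values)."""
    tag = f[0]
    if tag == "atom":
        return f[2](m)
    if tag == "and":
        return ev(f[1], m) and ev(f[2], m)
    if tag == "or":
        return ev(f[1], m) or ev(f[2], m)
    if tag == "false":
        return False
    raise ValueError(f"unknown node: {tag}")

def mbp(f, m):
    """Resolve every disjunction of f against the model m."""
    tag = f[0]
    if tag == "and":
        return ("and", mbp(f[1], m), mbp(f[2], m))
    if tag == "or":
        if ev(f[1], m):
            return mbp(f[1], m)      # MbpLeft: keep the left disjunct
        if ev(f[2], m):
            return mbp(f[2], m)      # MbpRight: keep the right disjunct
        return ("false",)            # MbpVac: neither disjunct holds in m
    return f

x_gt_5 = ("atom", "x > 5", lambda m: m["x"] > 5)
y_gt_0 = ("atom", "y > 0", lambda m: m["y"] > 0)
x_lt_0 = ("atom", "x < 0", lambda m: m["x"] < 0)
y_eq_1 = ("atom", "y = 1", lambda m: m["y"] == 1)

f = ("and", ("or", x_gt_5, y_gt_0), ("or", x_lt_0, y_eq_1))
model = {"x": 0, "y": 1}
g = mbp(f, model)   # the conjunction of y > 0 and y = 1
```

The result `g` is disjunction-free, still holds in `model`, and entails `f`, which is exactly the under-approximation property needed. A different model (say `x = 6, y = 1`) would select a different generalization.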

For each QE rule R, we can produce a corresponding
under-approximate rule $R_M$ that preserves model $M$. This rule can
be written R ; (MbpLeft | MbpRight | MbpVac)*. In
practice, we can choose to only apply the MBP rules to
disjunctions introduced by the QE rules and not to those
originally occurring in $\varphi$. Correspondingly, we can convert
our QE algorithm ArrayQE to ArrayQE$_M$ by replacing
each rule R with $R_M$. We can then obtain an MBP
ArrayMBP($\varphi$)($M$) = ArrayQE$_M$($\varphi$) and we can show
the following:

Theorem 3: For any quantifier-free formula $\varphi$ in ARR,
ArrayMBP($\exists a : arr(I, V) \cdot \varphi$) is a finite MBP.

The fact that it is an MBP can be easily shown by induction
on the number of rewrites applied. The fact that it is finite
derives from the fact that there are only finitely many ways to
resolve the disjunctions in the QE result.

Moreover, assuming that the evaluation of a formula in
a model can be done in polynomial time, we can evaluate
ArrayMBP($\varphi$)($M$) in time that is polynomial in the size of
$M$ and the size of $\varphi$. This is because we can polynomially
bound the number of times each rule $R_M$ applies, and each
rule can only expand the formula size by a constant amount.
Fig. 6 shows an example of applying ArrayMBP.

C. MBP for ARR+LIA 

We now consider the combination of the ARR and LIA
theories. Assume that the only basic sorts are bool and
int. Furthermore, we only consider linear functions over int
along with a divisibility predicate (with constant divisors).
We developed a finite MBP for LIA in a previous work [17]
(call it LiaMBP). When the index sort $I$ is int, one can
obtain a more efficient MBP with a slight modification of
Ackermann$_M$ (for eliminating array read terms) that utilizes
the predicate symbol $<$. Given a model $M$ of the formula,









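$\exists a \cdot (b = wr(a, i_1, v_1) \lor (rd(wr(a, i_2, v_2), i_3) > 5 \land rd(a, i_4) > 0))$

{WrRd$_M$, $M \models i_2 \neq i_3$}
$\exists a \cdot (i_2 \neq i_3 \land (b = wr(a, i_1, v_1) \lor (rd(a, i_3) > 5 \land rd(a, i_4) > 0)))$

{PartialEq; WrEq$_M$}
$\exists a \cdot (i_2 \neq i_3 \land ((a =_{i_1} b \land rd(b, i_1) = v_1) \lor (rd(a, i_3) > 5 \land rd(a, i_4) > 0)))$

{CaseEq$_M$, $M \not\models a =_{i_1} b$}
$\exists a \cdot \neg(a =_{i_1} b) \land i_2 \neq i_3 \land (rd(a, i_3) > 5 \land rd(a, i_4) > 0)$

{FactorRd}
$\exists a, s_3, s_4 \cdot (\neg(a =_{i_1} b) \land \varphi_2) \land s_3 = rd(a, i_3) \land s_4 = rd(a, i_4)$,
where $\varphi_2 = i_2 \neq i_3 \land (s_3 > 5 \land s_4 > 0)$

{LiftEqDiseqRd}
$\exists a, s_3, s_4 \cdot (\varphi_2 \land \neg(a =_{i_1} b) \land s_3 = rd(a, i_3) \land s_4 = rd(a, i_4))$

{ElimDiseq}
$\exists a, s_3, s_4 \cdot (\varphi_2 \land s_3 = rd(a, i_3) \land s_4 = rd(a, i_4))$

{Ack$_M$, $M \models i_3 = i_4$}
$\exists s_3, s_4 \cdot (\varphi_2 \land (i_3 = i_4 \land s_3 = s_4))$

Fig. 6: Illustrating ArrayMBP on the example of Fig. 4 with a given model $M$.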


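MbpLeft: $\dfrac{\varphi[\psi_1 \lor \psi_2] \quad M \models \varphi, \psi_1}{\varphi[\psi_1]}$

MbpRight: $\dfrac{\varphi[\psi_1 \lor \psi_2] \quad M \models \varphi, \psi_2}{\varphi[\psi_2]}$

MbpVac: $\dfrac{\varphi[\psi_1 \lor \psi_2] \quad M \models \varphi \quad M \not\models \psi_1, \psi_2}{\varphi[\bot]}$

Fig. 5: MBP rules for formulas in negation-normal form.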


one can first partition the set of index terms $t_k$'s according
to their interpretations in $M$ and choose a representative for
each equivalence class. Then, the conjunction in the result of
the rule is modified as follows: (a) for every equivalence class,
add the equality $t_k = t_l$ for every non-representative $t_l$, where
$t_k$ is the representative, (b) linearly order the representatives
and add the corresponding inequalities. The modified rule (and
hence, the resulting MBP) is linear in time and space.
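A minimal sketch of this model-based variant follows, with several assumptions made for illustration. Read equalities $s_k = rd(a, t_k)$ are pairs of variable names, the model maps names to integers, and the output literals are plain strings. The equality between the value variables of one class is also emitted, following the last step of Fig. 6. The sketch sorts the classes for simplicity, so it does not itself attain the linear bound stated above.

```python
def ackermann_m(reads, model):
    """Model-based replacement for the Ackermann implications.

    reads: list of (s_k, t_k) pairs for s_k = rd(a, t_k), with integer
    index terms. Returns literals that are true in the model."""
    # Partition the read terms by the model value of their index term.
    classes = {}
    for s, t in reads:
        classes.setdefault(model[t], []).append((s, t))

    literals = []
    representatives = []
    for value in sorted(classes):
        (rep_s, rep_t), *others = classes[value]
        representatives.append(rep_t)
        # (a) tie every non-representative to the representative.
        for s, t in others:
            literals.append(f"{rep_t} = {t}")
            literals.append(f"{rep_s} = {s}")
    # (b) linearly order the representatives.
    for lower, upper in zip(representatives, representatives[1:]):
        literals.append(f"{lower} < {upper}")
    return literals

reads = [("s3", "i3"), ("s4", "i4"), ("s5", "i5")]
model = {"i3": 2, "i4": 2, "i5": 0, "s3": 6, "s4": 6, "s5": 1}
lits = ackermann_m(reads, model)
# lits == ["i3 = i4", "s3 = s4", "i5 < i3"]
```

The output has one literal per adjacent pair of representatives plus two per non-representative, instead of one implication per pair of read terms.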

However, the combination of arrays and integers introduces
terms over the combined signature which need to be handled
as well. For example, there is no equivalent quantifier-free
formula for $\exists i : int \cdot rd(a, i) > 0$. This implies that there
does not exist a finite MBP for the combination of LIA and
ARR. In the example, the only way to under-approximate the
quantification is to use the substitution method, replacing $i$
with its interpretation in a model $M \models rd(a, i) > 0$ as a
numeric literal.

Based on the above observations, we obtain an MBP for
ARR+LIA as follows. First, we apply ArrayMBP, using the
modified Ackermann$_M$ above, to eliminate array quantifiers.
Then, we use LiaMBP to eliminate integer quantifiers that do
not appear in any array term. Finally, we use the substitution
method to eliminate any remaining integer quantifiers. When
the last step of the substitution method is not necessary, the
resulting MBP will be finite.


IV. The Compositional Verification Framework

MBP plays a crucial role in enabling the search for
compositional proofs. In this section, we will consider the role played
by MBP in a model checking framework called SPACER [17].
In this framework, MBP is used to create succinct localized 
proof sub-goals that make it possible to reason about only 
one procedure at a time. The proof goals take the form of 
under-approximate summaries, either of the calling context of 
a procedure or of the procedure itself. Without some form 
of projection, SPACER would not be compositional, as it 
would build up formulas of exponential size, in effect inlining 
procedures to create bounded model checking formulas. 

A. Modeling programs with CHCs 

Spacer checks safety of procedural programs by reducing
the problem to SMT of a special kind of formulas known as
Constrained Horn Clauses (CHCs) [5], [17], [14]. We augment
the signature $\mathcal{S}$ with a set of fresh predicate symbols $\mathcal{P}$. A
Constrained Horn Clause (CHC) is a formula of the form

$$\forall \bar{x} \cdot \underbrace{\bigwedge_{k=1}^{m} P_k(\bar{x}_k) \land \varphi(\bar{x})}_{body} \Rightarrow head$$

where for each $k$, $P_k$ is a symbol in $\mathcal{P}$, $\bar{x}_k \subseteq \bar{x}$ and $|\bar{x}_k|$
is equal to the arity of $P_k$. The constraint $\varphi$ is a formula
over $\mathcal{S}$, and head is either an application of a predicate in
$\mathcal{P}$ or another formula over $\mathcal{S}$. We use body to refer to the
antecedent of the CHC, as shown above. A CHC is called
a query if head is a formula over $\mathcal{S}$ and otherwise, it is
called a rule. If $m \le 1$ in the body, the CHC is linear and
is non-linear otherwise. Following the convention of logic
programming literature, we also write the above CHC as
$head \leftarrow P_1(\bar{x}_1), \ldots, P_m(\bar{x}_m), \varphi(\bar{x})$.
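The terminology can be captured in a small data structure. This is a sketch with hypothetical field names; constraints and non-predicate heads are kept as opaque strings, since only the shape of the clause matters for the classification.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class CHC:
    body_preds: Tuple[str, ...]   # predicate applications P_k in the body
    constraint: str               # the constraint over the signature
    head_pred: Optional[str]      # None when head is a plain formula
    head_formula: str = ""        # used only when head_pred is None

    @property
    def is_query(self) -> bool:
        """A query has a formula over the signature as its head."""
        return self.head_pred is None

    @property
    def is_linear(self) -> bool:
        """Linear means at most one predicate application in the body."""
        return len(self.body_preds) <= 1

# A hypothetical single-predicate encoding with an initial rule,
# a query, and a transition rule that includes a procedure call.
init_rule = CHC(body_preds=(), constraint="init(x)", head_pred="Inv")
query = CHC(body_preds=("Inv",), constraint="true",
            head_pred=None, head_formula="not bad(x)")
step_rule = CHC(body_preds=("Inv", "Inv"), constraint="tr(x, xo, x')",
                head_pred="Inv")
```

Here `step_rule` is the only non-linear clause because its body applies the predicate twice, once for the caller's pre-state and once for the callee's summary.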

Intuitively, each predicate symbol $P_k$ represents an unknown
partial correctness specification of a procedure (that is, an
over-approximate summary). A query defines a property to be
proved, while each rule gives a modular verification condition
for one procedure. A satisfying assignment to the symbols $P_k$
is thus a certificate that the program satisfies its specification
and corresponds to the annotations in a Floyd/Hoare style
proof. In this work, we are interested in finding annotations
that can be expressed in the quantifier-free fragment of our
first-order language, to avoid the difficulty of reasoning with
quantifiers.

Any given set of CHCs encoding safety of procedural
programs can be transformed to an equisatisfiable set of just
three CHCs with a single predicate symbol (encoding the
program location using a variable). These CHCs have the
following form:

$$Inv(\bar{x}) \leftarrow init(\bar{x}) \qquad \neg bad(\bar{x}) \leftarrow Inv(\bar{x})$$
$$Inv(\bar{x}') \leftarrow Inv(\bar{x}), Inv(\bar{x}^o), tr(\bar{x}, \bar{x}^o, \bar{x}') \quad (4)$$

Intuitively, $Inv$ is the program invariant, $\bar{x}$ denotes the
pre-state of a program transition, $\bar{x}'$ denotes the post-state, and
$\bar{x}^o$ denotes the summary of a procedure call (if one is made).
If there are no procedure calls, $tr$ is independent of $\bar{x}^o$ and
$Inv(\bar{x}^o)$ can be dropped; in this case $Inv$ denotes an inductive
invariant of an ordinary transition system. In the sequel, we
restrict to this normal form and consider only quantifier-free
interpretations of the predicate $Inv$.

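It is useful to rewrite the above rules using a function $\mathcal{F}$
that substitutes given predicates $\phi_A(\bar{x})$ and $\phi_B(\bar{x})$ for the
occurrences of $Inv$ in the rule bodies. That is, let

$$\mathcal{F}(\phi_A, \phi_B) \equiv (\phi_A(\bar{x}) \land \phi_B(\bar{x}^o) \land tr(\bar{x}, \bar{x}^o, \bar{x}')) \lor init(\bar{x}')$$

The rules are thus equivalent to $\mathcal{F}(Inv, Inv) \Rightarrow Inv(\bar{x}')$.
Abusing notation, we will also write $\mathcal{F}(\phi_A)$ for $\mathcal{F}(\phi_A, \phi_A)$.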

B. The Spacer framework 

Spacer is a general framework that can be instantiated for
a given logical theory $T$ by supplying three elements: (a) a
model-generating SMT solver for $T$, (b) an MBP procedure
Mbp for $T$ and (c) an interpolation procedure ITP for $T$.
Compared to other SMT-based algorithms (e.g., [3], [13], [10],
[18]), the key distinguishing feature of SPACER is
compositional reasoning. That is, instead of checking satisfiability
of large formulas generated by program unwinding, SPACER
iteratively creates and checks local reachability queries for
individual procedures. In this way it is similar to IC3 [6], [9],
a SAT-based algorithm for safety of finite-state transition
systems, and GPDR [16], its extension to Linear Real Arithmetic.
Like these methods, SPACER maintains a sequence of
over-approximations of procedure behaviors, called may summaries,
corresponding to program unwindings. However, unlike other
approaches, SPACER also maintains under-approximations of
procedure behaviors, called must summaries, to avoid
redundant reachability queries. Another distinguishing feature of
Spacer is the use of MBP for efficiently handling existentially
quantified formulas to create a new query or a must summary.
We note, however, that MBP is a general technique and can
be exploited in IC3/PDR as well.^2

^2 Arguably sub-goal creation in IC3 is a simple MBP for propositional logic.


Alg. 2 gives a simplified description of SPACER as a solver
for CHCs in the form of (4) (though SPACER handles general 
CHCs). It is described using a set of rules that can be applied 
non-deterministically. Each rule is presented as a guarded 
command “[ grd ] cmd”, where cmd can be executed only 
if grd holds. 

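Input: Formulas $init(\bar{x})$, $tr(\bar{x}, \bar{x}^o, \bar{x}')$, $bad(\bar{x})$

Output: Inductive invariant (FO interpretation of $Inv$
satisfying (4)) or UNSAFE

if $(init \land bad)$ satisfiable then return UNSAFE
// initialize data structures
$Q := \emptyset$ // set of pairs $\langle \varphi, i \rangle$, $i \in \mathbb{N}$
$N := 0$ // max level, or recursion depth
$O_0 = init$, $O_i = \top$, $\forall i > 0$ // may summary sequence
$U = init$ // must summary

forever non-deterministically do

(Candidate) [ $(O_N \land bad)$ satisfiable ]
    $Q := Q \cup \langle \varphi, N \rangle$, for some $\varphi \Rightarrow O_N \land bad$
(DecideMust) [ $\langle \varphi, i + 1 \rangle \in Q$, $M \models \mathcal{F}(O_i, U) \land \varphi'$ ]
    $Q := Q \cup \langle$Mbp$(\exists \bar{x}^o, \bar{x}' \cdot \mathcal{F}(O_i, U) \land \varphi', M), i \rangle$
(DecideMay) [ $\langle \varphi, i + 1 \rangle \in Q$, $M \models \mathcal{F}(O_i) \land \varphi'$ ]
    $Q := Q \cup \langle$Mbp$(\exists \bar{x}, \bar{x}' \cdot \mathcal{F}(O_i) \land \varphi', M)[\bar{x}/\bar{x}^o], i \rangle$
(Leaf) [ $\langle \varphi, i \rangle \in Q$, $\mathcal{F}(O_{i-1}) \Rightarrow \neg\varphi'$, $i < N$ ]
    $Q := Q \cup \langle \varphi, i + 1 \rangle$
(Successor) [ $\langle \varphi, i + 1 \rangle \in Q$, $M \models \mathcal{F}(U) \land \varphi'$ ]
    $U := U \lor$ Mbp$(\exists \bar{x}, \bar{x}^o \cdot \mathcal{F}(U) \land \varphi', M)[\bar{x}/\bar{x}']$
(Conflict) [ $\langle \varphi, i + 1 \rangle \in Q$, $\mathcal{F}(O_i) \Rightarrow \neg\varphi'$ ]
    $O_j := O_j \land$ ITP$(\mathcal{F}(O_i), \neg\varphi')[\bar{x}/\bar{x}']$, $\forall j \le i + 1$
(Induction) [ $(\varphi \lor \psi) \in O_i$, $\mathcal{F}(\varphi \land O_i) \Rightarrow \varphi'$ ]
    $O_j := O_j \land \varphi$, $\forall j \le i + 1$
(Unfold) [ $O_N \Rightarrow \neg bad$ ] $N := N + 1$
(Safe) [ $O_{i+1} \Rightarrow O_i$ ] return invariant $O_i$
(Unsafe) [ $(U \land bad)$ satisfiable ] return UNSAFE

Algorithm 2: Rule-based description of SPACER.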

As shown in Alg. 2, SPACER maintains a set of reachability
queries $Q$, a sequence of may summaries $\{O_i\}_{i \in \mathbb{N}}$, and a
must summary $U$. Intuitively, a query $\langle \varphi, i \rangle$ corresponds to
checking if $\varphi$ is reachable for recursion depth $i$, $O_i$
over-approximates the reachable states for recursion depth $i$, and $U$
under-approximates the reachable states. $N$ denotes the current
bound on recursion depth. The sequence of may summaries
and $N$ correspond to the trace of approximations and the
maximum level in IC3/PDR, respectively. For convenience,
let $O_{-1}$ be $\bot$. Mbp($\varphi$, $M$), for a formula $\varphi = \exists \bar{v} \cdot \varphi_{qf}$ and
model $M \models \varphi_{qf}$, denotes the result of some MBP function
associated with $\varphi$ for the model $M$.

Alg. 2 initializes $N$ to 0, and $O_0$ and $U$ to $init$. Candidate
initiates a backward search for a counterexample beginning
with a set of states in $bad$. The potential counterexample is
expanded using either DecideMust or DecideMay. DecideMust
jumps over the call $Inv(\bar{x}^o)$, in the last CHC of (4), utilizing
the must summary $U$. DecideMay, on the other hand, creates a
query for the call using the may summary of its calling context.
Successor updates $U$ when a query is known to be reachable.
The other rules are similar to IC3 [6] and GPDR [16] and
we skip their explanation in the interest of space. SPACER is
sound and if Mbp utilizes finite MBP functions, SPACER also
terminates for a fixed $N$ [17].

C. Instantiation for ARR +LIA 

In instantiating this framework for ARR+LIA, the key
ingredient is the MBP procedure of the previous section. An
interpolation procedure ITP can be trivially obtained by using
a literal-dropping approach based on UNSAT cores, or a more
sophisticated approach can be taken (e.g., see [16], [18]).

Because we do not have a finite MBP, SPACER is not
guaranteed to terminate even for a fixed bound on the recursion
depth $N$. That is, it can generate an infinite sequence of
queries and must summaries. Note that MBP is used in 3
rules: DecideMay, DecideMust, and Successor. The
elimination of quantifiers in Successor is only an optimization
and can be avoided. For DecideMay and DecideMust, however,
it cannot be avoided without changing the structure of the queries,
the considerations of which are outside the scope of this paper.
In the following, we identify restrictions on the CHCs where
termination is still guaranteed and for the other cases, we
propose some heuristic modifications to Mbp and ITP to help
avoid divergence.

1) Equality resolution in Mbp: There are several cases
where terms over combined signatures appear in conjunction
with equality terms over the index quantifier, e.g.,
$\exists i : int \cdot i = t \land rd(a, i) > 0$ for a term $t$ independent of $i$. In these cases,
the quantifier can be eliminated using equality resolution, e.g.,
$rd(a, t) > 0$ in the above example. Such cases seem to be
natural in the case of a single procedure, i.e., when $tr$ in
(4) is independent of $\bar{x}^o$. Consider a disjunct $\delta$ in a DNF
representation of $tr$. Now, $\delta$ represents a path in the procedure
and typically, index terms (in reads and writes) in $\delta$ can be
ordered such that every index term is a function of the previous
index terms or the current-state variables $\bar{x}$. This makes it
possible to eliminate any index variables in $\bar{x}'$ using equality
resolution as mentioned above.
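Equality resolution itself is a plain substitution, sketched below under an assumed term representation. Terms are nested tuples whose first component is an operator name, variables are strings, and variable names are assumed not to clash with operator names. The function returns `None` when no defining equality for the quantified variable is found.

```python
def occurs(term, var):
    """True if variable var occurs anywhere inside term."""
    if term == var:
        return True
    return isinstance(term, tuple) and any(occurs(t, var) for t in term)

def subst(term, var, replacement):
    """Replace every occurrence of var in term by replacement."""
    if term == var:
        return replacement
    if isinstance(term, tuple):
        return tuple(subst(t, var, replacement) for t in term)
    return term

def equality_resolution(var, conjuncts):
    """Eliminate 'exists var' from a conjunction that contains an
    equality var = t with var not occurring in t."""
    for k, c in enumerate(conjuncts):
        if isinstance(c, tuple) and len(c) == 3 and c[0] == "=":
            for lhs, rhs in ((c[1], c[2]), (c[2], c[1])):
                if lhs == var and not occurs(rhs, var):
                    rest = conjuncts[:k] + conjuncts[k + 1:]
                    return [subst(r, var, rhs) for r in rest]
    return None

# exists i . i = j + 1 and rd(a, i) > 0   becomes   rd(a, j + 1) > 0
conjuncts = [("=", "i", ("+", "j", 1)), (">", ("rd", "a", "i"), 0)]
resolved = equality_resolution("i", conjuncts)
```

When the function returns `None`, as for the lone conjunct `rd(a, i) > 0`, the quantifier has to be handled by another method such as substitution from the model.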

2) Privileging array equalities: Here is a simple example
that exhibits non-termination:

$$Inv(a, b) \leftarrow a = b$$
$$\bot \leftarrow Inv(a, b), rd(a, j) < 0, rd(b, j) > 0$$

Here, intuitively, $Inv(a, b)$ denotes the summary of a
procedure which takes an array $a$ as input and produces $b$ as output
and we are interested in checking if there is a sign change in the
value at an index $j$ as a result of the procedure call. For this
example, DecideMay creates queries of the form
$rd(a, k) < 0 \land rd(b, k) > 0$ where $k$ is a specific integer constant. If ITP
returns interpolants of the form $rd(a, k) = rd(b, k)$, it is easy
to see that SPACER would not terminate even for $N = 0$, even
though there is a trivial solution: $a = b$.

To alleviate this problem, we modify Mbp and ITP to
promote the use of array equalities in interpolants. Let $\varphi$ be
the result of Mbp for a given model $M$. For every pair of
array terms $a$, $b$ in $\varphi$, we strengthen $\varphi$ with the array
equality $a = b$ or disequality $a \neq b$, depending on whether
$M \models a = b$ holds or not. In the above example, the queries
will now be of the form
$rd(a, k) < 0 \wedge rd(b, k) > 0 \wedge a \neq b$. However,
$rd(a, k) = rd(b, k)$ continues to be an interpolant whereas
the desired interpolant is $a = b$. To reduce the dependence on
specific integer constants in the learned interpolants, and hence
in the may summaries, we modify ITP as follows. Suppose we
are computing an interpolant for $\psi \Rightarrow \neg\varphi'$ (as
occurs in Conflict). We let $\varphi = \varphi_1 \wedge \varphi_2$ where
$\varphi_2$ contains all the literals
where an integer quantifier is substituted using its interpretation
in a model. Using a minimal unsatisfiable subset (MUS)
algorithm, we can generalize $\varphi_2$ to $\hat{\varphi}_2$ such that
$\psi \wedge (\varphi_1 \wedge \hat{\varphi}_2)'$ is unsatisfiable and
then obtain $\mathrm{ITP}(\psi, \neg(\varphi_1 \wedge \hat{\varphi}_2)')$.
In the above example, for $N = 0$ we have $\psi = (a = b)$,
$\varphi_1 = (a \neq b)$, and $\varphi_2 = rd(a, k) < 0 \wedge rd(b, k) > 0$.
One can show that $\hat{\varphi}_2$ is simply $\top$ and the only possible
interpolant is $a = b$. In our implementation, we add such
(dis-)equalities on-demand in a lazy fashion. Note that adding
such (dis-)equalities to the queries is only a heuristic and may
not always help with termination.
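
The strengthening step can be sketched as follows. This is only an illustration under stated assumptions: the model maps each array name to a hypothetical pair `(default, exceptions)`, literals are plain tuples, and the function names are invented for this example. It shows the eager version of the idea, whereas the implementation described above adds the (dis-)equalities lazily.

```python
from itertools import combinations

def normalize(array_value):
    """Canonical form of an array interpretation (default, exceptions).

    Entries that merely restate the default are dropped, so two
    interpretations denote the same array exactly when their normal
    forms are equal (assuming an infinite index domain).
    """
    default, entries = array_value
    return default, {i: v for i, v in entries.items() if v != default}

def strengthen_with_array_equalities(literals, array_names, model):
    """Add a = b or a != b for every pair of arrays, as decided by the model."""
    result = list(literals)
    for a, b in combinations(array_names, 2):
        same = normalize(model[a]) == normalize(model[b])
        result.append(("=" if same else "!=", a, b))
    return result

# Query rd(a, 3) < 0 /\ rd(b, 3) > 0 with a model in which a and b differ at 3.
query = [("<", ("rd", "a", 3), 0), (">", ("rd", "b", 3), 0)]
model = {"a": (0, {3: -1}), "b": (0, {3: 1})}
strengthened = strengthen_with_array_equalities(query, ["a", "b"], model)
```

In this model the arrays differ, so the query gains the literal `a != b`, matching the strengthened query form given in the text.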

V. Experimental Results 

As noted in the introduction, the array theory allows us to 
model heap references accurately. This eliminates the need to 
inline procedures so that heap-allocated objects are reduced to 
local variables. We hypothesize that the resulting increase in
modularity will allow SPACER to more efficiently verify procedural
programs using ArrayMbp, in spite of the potential
for divergence due to non-finiteness of the MBP.

We test this hypothesis using a prototype implementation 
of Spacer with ArrayMbp.^ To verify C programs, we 
use SeaHorn [14], which uses the LLVM infrastructure to 
compile and optimize the input program, then encodes the 
verification conditions as CHCs in the SMT-LIB2 format. 
SeaHorn can optionally inline procedure calls before encoding,
allowing us to test our hypothesis regarding modularity.

For reference, we also compare SPACER to the implementation
of GPDR [16] in Z3 [8]. A key difference between
Spacer and GPDR is that the latter does not use must 
summaries. Z3 also uses MBP, but is limited to equality 
resolution and the substitution method. As a result Z3 GPDR 
is effective only for inlined programs. 

We use benchmarks from the software verification competition
SVCOMP’15 [4]. We considered the 215 benchmarks
from the Device Drivers category where Z3 GPDR (with inlining)
needed more than a minute of runtime or did not terminate
within the resource limits of SVCOMP [15]. All experiments
have been carried out using a 2.2 GHz AMD Opteron(TM) 
Processor 6174 and 516GB RAM, running Ubuntu Linux. Our 
resource limits are 30 minutes and 15GB for each verification 
task. In the scatter plots that follow, a diamond indicates a 
time-out, a star indicates a mem-out, and a box indicates an 
anomaly in the implementation. 

^https://bitbucket.org/spacer/code 
Fig. 7: Advantage of inter-procedural encoding using SPACER.
Fig. 8: Spacer vs. Z3 on hard SVCOMP benchmarks with inlining.


The scatter plot in Fig. 7 compares the combined run time 
for the CHC encoding and verification, when inlining is turned 
on and off. A clear advantage is seen in the non-inlining case. 
This shows that SPACER is able to effectively exploit the additional
modularity that is made possible by ArrayMBP, and
that this advantage outweighs any occurrences of divergence
due to non-finite MBP.^ We note that SPACER with only LIA
is able to handle only a small fraction of the non-inlined 
benchmarks. This result confirms our hypothesis. 

For reference, we also compare to the performance of 
Z3 GPDR. We observed that without ArrayMBP, Z3 is 
very ineffective in the non-inlined case. We should mention, 
however, that of the 7 unsafe programs verified by Z3, 5 could 
not be verified by SPACER. Fig. 8 compares SPACER and Z3 
with inlining on. This shows an overwhelming advantage for 
Spacer, which is due to its more effective MBP approach. 

VI. Related Work 

There are several SMT-based approaches for sequential 
program verification that iteratively check satisfiability of 
formulas corresponding to safety of various unwindings of the 
program [3], [13], [10], [18]. However, these monolithic SMT 
formulas can grow exponentially. In contrast, the SPACER 
framework [17] we use allows us to do a compositional proof 
search for safety. Such local proof search is also found in
the IC3 algorithm for hardware model checking [6] and its
extensions to software model checking (e.g., [16]), although
Spacer is the first to use under-approximate summaries of
procedures for avoiding redundant proof sub-goals. Model-
based generalizations have also been used to obtain projections
efficiently in decision procedures for quantified formulas [19].

^Unfortunately, we have no way to distinguish divergence from timeouts.

VII. Conclusion and Future Work 
We have presented a procedure for existentially projecting 
array variables from formulas over combined theories of 
ARR, LIA, and propositional logic. We have adapted the 
procedure to a finite MBP for array variables. While existential 
projection is worst-case exponential, the corresponding MBP 
is polynomial. However, projecting arrays might introduce 
new existentially quantified variables (whose sort is the same 
as the index- or value-sort of the eliminated array). For 
projecting these variables, a finite MBP need not exist. We 
described heuristics for obtaining a practical (but not necessarily
finite) MBP procedure, obtaining an instantiation of
the Spacer framework for verification of safety of sequential 
heap-manipulating programs. We show that the new variant of 
Spacer is effective for constructing compositional proofs of 
Linux Device Drivers. In the future, we plan to extend these 
ideas for handling more complex heap-manipulating programs 
that require universal quantifiers in the program invariants. 
$$\textsc{ElimDiseqFinite}\quad \frac{\exists a \cdot (\neg(a =_{\bar{i}} t) \wedge \varphi)}{\exists a, j \cdot (rd(a, j) \neq rd(t, j) \wedge j \notin \bar{i} \wedge \varphi)} \quad \text{where } a \text{ does not appear in } t$$

Fig. 9: Modified version of ElimDiseq for finite domains.


Appendix A 

QE and MBP for ARR over Finite Index Domains

When finite interpretations of I are allowed, ElimDiseq is 
no longer an equivalent transformation as there may not exist 
an index where the arrays in the disequalities disagree on the 
values. However, one can use extensionality to obtain another 
equivalent transformation rule ElimDiseqFinite, as shown 
in Fig. 9. As this rule introduces new read terms over a, we 
need to apply FactorRd once again before Ackermann. 
Also, note that the result of QE and MBP is now of the form
$\exists i : I, v : V \cdot \rho$.
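
The role of extensionality over a finite index domain can be checked by brute force on a tiny instance. The sketch below assumes the usual reading of a partial equality (the two arrays agree on every index outside the given index set). It represents arrays as Python tuples, and all function names are hypothetical. It checks that the negated partial equality coincides with the existence of a witness index outside the set, and that no such witness exists once the set covers the whole domain.

```python
from itertools import product

def partial_eq(a, t, skip):
    """a and t agree on every index outside `skip` (assumed reading)."""
    return all(a[j] == t[j] for j in range(len(a)) if j not in skip)

def premise(a, t, skip):
    """The disequality literal: a and t are not partially equal."""
    return not partial_eq(a, t, skip)

def conclusion(a, t, skip):
    """There is an index j outside `skip` with rd(a, j) != rd(t, j)."""
    return any(a[j] != t[j] for j in range(len(a)) if j not in skip)

def rule_is_equivalence(num_indices, values):
    """Check premise == conclusion for all arrays and all index sets."""
    arrays = list(product(values, repeat=num_indices))
    index_sets = [
        {j for j in range(num_indices) if (mask >> j) & 1}
        for mask in range(2 ** num_indices)
    ]
    return all(
        premise(a, t, s) == conclusion(a, t, s)
        for a in arrays for t in arrays for s in index_sets
    )

def diseq_satisfiable(t, skip, values):
    """Is there any array a that is not partially equal to t?"""
    return any(premise(a, t, skip) for a in product(values, repeat=len(t)))

equivalent = rule_is_equivalence(2, (0, 1))
no_witness = diseq_satisfiable((0, 0), {0, 1}, (0, 1))
has_witness = diseq_satisfiable((0, 0), {0}, (0, 1))
```

The second check is the situation described above: with a two-element index domain and both indices excluded, no array can disagree with `t`, so the disequality cannot simply be dropped.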

Appendix B 

Proofs of Statements about ArrayQE and ArrayMBP

Theorem 1: ArrayQE($\exists a : arr(I, V) \cdot \varphi$) returns
$\exists v : V \cdot \rho$, where $\rho$ is quantifier-free and
$\exists v \cdot \rho \equiv \exists a \cdot \varphi$.

Proof: (Sketch) One can easily show that the rules in
Figs. 1, 2, and 3 are equivalence preserving. The theorem
follows immediately. ■

Theorem 2: ArrayQE($\exists a \cdot \varphi$) terminates in time
exponential in the size of $\varphi$.

Proof: (Sketch) Line 1 of ArrayQE essentially eliminates
write terms one by one and can be easily shown to
terminate. Line 2 can be easily made to terminate by iterating
over all partial equality and read terms. The remaining steps
of the algorithm clearly terminate as well.

The complexity analysis is similar to that of the decision
procedure by Stump et al. [20]. Let $N$ be the size of $\varphi$. The
number of disjuncts generated by any rewrite rule is bounded
by $N$ (due to the disjunction $j \in \bar{i}$ on indices in ElimWrEq).
Disjunctions can be generated by the rules for every write term
or partial equality and their number is bounded by $N$. So,
the total number of disjunctions generated by the algorithm is
bounded by $O(N^N)$ which is exponential in $N$. The size of
a disjunct generated by a rule can be shown to be bounded
by a polynomial in $N$. CaseSplitEq can be efficiently
implemented using an $(N + 1)$-way case analysis over all $N$
partial equalities at once avoiding a Boolean rewriting on line 3
of the algorithm. That is, one can obtain $N + 1$ disjuncts, one
each for the case of a partial equality being true and the last
one for the case of every partial equality being false. Thus,
the complexity of ArrayQE is exponential in $N$. ■
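
As a short worked step, here is one way to read the bound, under the assumption that the count arises from at most $N$ rule applications, each producing at most $N$ disjuncts:

```latex
\underbrace{N \cdot N \cdots N}_{N \text{ times}}
  \;=\; N^{N}
  \;=\; \left(2^{\log_2 N}\right)^{N}
  \;=\; 2^{\,N \log_2 N}
```

Under this reading the number of disjuncts is $2^{O(N \log N)}$. The exponent is polynomial in $N$, which is the sense in which the bound is exponential in $N$.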



